49 research outputs found

    Lossless compression algorithms for microarray images and whole genome alignments (Algoritmos de compressão sem perdas para imagens de microarrays e alinhamento de genomas completos)

    Doctoral thesis in Informatics. Nowadays, in the 21st century, the never-ending expansion of information is a major global concern. The pace at which storage and communication resources are evolving is not fast enough to compensate for this tendency. To overcome this issue, sophisticated and efficient compression tools are required. The goal of compression is to represent information with as few bits as possible. There are two kinds of compression, lossy and lossless. In lossless compression, information loss is not tolerated, so the decoded information is exactly the same as the encoded one. In lossy compression, on the other hand, some loss is acceptable. This work focused on lossless methods. The goal of this thesis was to create lossless compression tools for two types of data. The first type is known in the literature as microarray images. These images have 16 bits per pixel and a high spatial resolution. The other data type is commonly called Whole Genome Alignments (WGA), particularly as stored in MAF files. Regarding the microarray images, we improved existing microarray-specific methods by using pre-processing techniques (segmentation and bitplane reduction). We also developed a compression method based on pixel value estimates and a mixture of finite-context models, and considered an approach based on binary-tree decomposition. Two compression tools were developed for MAF files. The first is based on a mixture of finite-context models and arithmetic coding, and considers only the DNA bases and alignment gaps. The second, designated MAFCO, is a complete compression tool that handles all the information found in MAF files. MAFCO relies on several finite-context models and allows parallel compression/decompression of MAF files.
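    The finite-context modelling mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the alphabet encoding, the function name `fcm_code_length`, the smoothing parameter `alpha`, and the fixed start context are all assumptions made for illustration. It uses a single adaptive order-k model (no mixture) and, instead of running an actual arithmetic coder, sums -log2(p) to estimate the bits an ideal coder would spend.

```python
import math
from collections import defaultdict

# Assumed symbol set: four DNA bases plus one alignment-gap symbol.
ALPHABET = "ACGT-"


def fcm_code_length(seq, order=2, alpha=1.0):
    """Estimate the bits an ideal arithmetic coder would need for `seq`
    under a single adaptive finite-context model of the given order.

    `alpha` is an additive smoothing constant (hypothetical default).
    """
    # One table of symbol counts per context (the previous `order` symbols).
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))
    total_bits = 0.0
    context = ALPHABET[0] * order  # arbitrary fixed start context
    for symbol in seq:
        ctx_counts = counts[context]
        seen = sum(ctx_counts.values())
        # Probability estimate from the counts seen so far in this context.
        p = (ctx_counts[symbol] + alpha) / (seen + alpha * len(ALPHABET))
        total_bits += -math.log2(p)
        # Update the model only after coding, so a decoder could mirror it.
        ctx_counts[symbol] += 1
        context = (context + symbol)[-order:] if order > 0 else ""
    return total_bits
```

    On a highly repetitive input such as `"ACGT" * 50`, the estimate falls well below the `200 * log2(5)` bits that a uniform code over the five symbols would need, which is the effect context modelling is meant to exploit.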

    Smart monitoring of constructed wetlands to improve efficiency and water quality

    The Smart monitoring of constructed wetlands to improve efficiency and water quality (SmarterCW) project aims to monitor biological wastewater treatment processes by gathering continuous data from remote water and environmental sensors. The acquired data can be processed and analysed with data science tools to better understand the complex, coupled phenomena underlying wastewater treatment, as well as to monitor and optimize system performance. The results will improve the efficiency and control of nature-based wastewater treatment technologies. The methodology comprises the following tasks and activities: implementation of a set of electrochemical sensors in the input and output flow streams of pilot-scale constructed wetlands; acquisition of water quality parameters such as pH, electrical conductivity, temperature, and ionic compounds; acquisition of environmental parameters, such as temperature and humidity; and application of data analysis tools to design and optimize conceptual models that correlate pollutant removal with operating parameters in green technologies for wastewater treatment. This methodology was applied to a patent-protected pilot-scale modular constructed wetland whose filling media consists of a mixture of solid waste. The system is complemented by a high-level IoT communication layer that supports remote real-time water and environmental monitoring, system performance tracking, and data dissemination. The project contributes to: Water and Environment, through the efficient management and use of water resources and through waste reduction, management, treatment, and valorisation; Materials and raw materials, through efficient, secure, and sustainable use of resources; and Environmental Education, promoting environmental awareness and best environmental practices through the dissemination of scientific data and results using Information and Communication Technologies (ICT) tools and IoT platforms. The project also helps respond to Societal Challenges such as environmental protection; sustainable management of natural resources, water, biodiversity, and ecosystems; and enabling the transition to a green society and economy through eco-innovation.

    A practical clinical score

    Copyright © 2022 Sociedade Portuguesa de Cardiologia. Published by Elsevier España, S.L.U. All rights reserved. INTRODUCTION AND OBJECTIVES: Obstructive coronary artery disease (CAD) remains the most common etiology of heart failure with reduced ejection fraction (HFrEF). However, there is controversy over whether invasive coronary angiography (ICA) should be used initially to exclude CAD in patients presenting with new-onset HFrEF of unknown etiology. Our study aimed to develop a clinical score to quantify the risk of obstructive CAD in these patients. METHODS: We performed a cross-sectional observational study of 452 consecutive patients presenting with new-onset HFrEF of unknown etiology undergoing elective ICA in one academic center between January 2005 and December 2019. Independent predictors of obstructive CAD were identified. A risk score was developed using multivariate logistic regression of designated variables. The accuracy and discriminative power of the predictive model were assessed. RESULTS: A total of 109 patients (24.1%) presented with obstructive CAD. Six independent predictors were identified and included in the score: male gender (2 points), diabetes (1 point), dyslipidemia (1 point), smoking (1 point), peripheral arterial disease (1 point), and regional wall motion abnormalities (3 points). Patients with a score ≤3 had less than 15% predicted probability of obstructive CAD. Our score showed good discriminative power (C-statistic 0.872; 95% CI 0.834-0.909; p<0.001) and calibration (p=0.333 from the goodness-of-fit test). CONCLUSIONS: A simple clinical score showed the ability to predict the risk of obstructive CAD in patients presenting with new-onset HFrEF of unknown etiology and may guide the clinician in selecting the most appropriate diagnostic modality for the assessment of obstructive CAD.
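    The point assignments reported in this abstract add up as in the following minimal sketch. The point values and the ≤3 cut-off come from the abstract; the field names, the function names `cad_score` and `is_low_score`, and the handling of unknown fields are hypothetical choices for illustration, and the abstract does not give the fitted regression coefficients, so no probability is computed.

```python
# Points per predictor, as listed in the abstract's RESULTS section.
# The dictionary keys are hypothetical field names.
POINTS = {
    "male": 2,
    "diabetes": 1,
    "dyslipidemia": 1,
    "smoking": 1,
    "peripheral_arterial_disease": 1,
    "regional_wall_motion_abnormalities": 3,
}


def cad_score(**findings):
    """Sum the points of every predictor flagged as present (True)."""
    unknown = set(findings) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown predictors: {sorted(unknown)}")
    return sum(POINTS[name] for name, present in findings.items() if present)


def is_low_score(score):
    """True when the score is <= 3, the group the abstract reports as having
    less than 15% predicted probability of obstructive CAD."""
    return score <= 3
```

    For example, `cad_score(male=True, diabetes=True)` gives 3, which falls in the ≤3 group, while a patient with all six predictors reaches the maximum of 9 points.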

    Arbutus unedo essence: morphological and genetic characterization of the strawberry tree of Castelo de Paiva

    The strawberry tree is a shrub native to the Mediterranean region and can be found throughout Portugal. Unlike in the south of the country, in the municipality of Castelo de Paiva little economic importance is attributed to this species. To preserve and enhance the production of this species and to help boost the local economy, we characterized the morphology and genetics of a sample of the strawberry tree population of Castelo de Paiva. The morphological and genetic characterization was performed for a total of 10 genotypes. For this purpose, 70 leaves were randomly collected from each tree. For 40 leaves, the length, width, peduncle length, fresh weight, and dry weight were measured and the leaf area was determined. Of the morphological characters analyzed, those that proved most useful in distinguishing the genotypes were peduncle length, fresh weight, and dry weight. The remaining 30 leaves were used for the genetic characterization, which was performed using a DNA marker, ISSR. The 5 primers used in the ISSR technique proved to be polymorphic. The results of the genetic characterization suggest that genetic variability in the population is medium to high.

    SARS-CoV-2 introductions and early dynamics of the epidemic in Portugal

    Genomic surveillance of SARS-CoV-2 in Portugal was rapidly implemented by the National Institute of Health in the early stages of the COVID-19 epidemic, in collaboration with more than 50 laboratories distributed nationwide. Methods: By applying recent phylodynamic models that allow integration of individual-based travel history, we reconstructed and characterized the spatio-temporal dynamics of SARS-CoV-2 introductions and early dissemination in Portugal. Results: We detected at least 277 independent SARS-CoV-2 introductions, mostly from European countries (namely the United Kingdom, Spain, France, Italy, and Switzerland), which was consistent with the countries with the highest connectivity with Portugal. Although most introductions were estimated to have occurred during early March 2020, it is likely that SARS-CoV-2 was silently circulating in Portugal throughout February, before the first cases were confirmed. Conclusions: We conclude that earlier implementation of measures could have minimized the number of introductions and subsequent virus expansion in Portugal. This study lays the foundation for genomic epidemiology of SARS-CoV-2 in Portugal, and highlights the need for systematic and geographically representative genomic surveillance. Acknowledgments: We gratefully acknowledge Sara Hill and Nuno Faria (University of Oxford) and Joshua Quick and Nick Loman (University of Birmingham) for kindly providing us with the initial sets of Artic Network primers for NGS; Rafael Mamede (MRamirez team, IMM, Lisbon) for developing and sharing a bioinformatics script for sequence curation (https://github.com/rfm-targa/BioinfUtils); Philippe Lemey (KU Leuven) for providing guidance on the implementation of the phylodynamic models; Joshua L. Cherry (National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health) for providing guidance with the subsampling strategies; and all authors, originating and submitting laboratories who have contributed genome data on GISAID (https://www.gisaid.org/) on which part of this research is based. The opinions expressed in this article are those of the authors and do not reflect the view of the National Institutes of Health, the Department of Health and Human Services, or the United States government. This study is co-funded by Fundação para a Ciência e Tecnologia and Agência de Investigação Clínica e Inovação Biomédica (234_596874175) on behalf of the Research 4 COVID-19 call. Some infrastructural resources used in this study come from the GenomePT project (POCI-01-0145-FEDER-022184), supported by COMPETE 2020 - Operational Programme for Competitiveness and Internationalisation (POCI), Lisboa Portugal Regional Operational Programme (Lisboa2020), Algarve Portugal Regional Operational Programme (CRESC Algarve2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), and by Fundação para a Ciência e a Tecnologia (FCT).

    Geographic patterns of tree dispersal modes in Amazonia and their ecological correlates

    Unidad de excelencia María de Maeztu CEX2019-000940-M. Aim: To investigate the geographic patterns and ecological correlates in the geographic distribution of the most common tree dispersal modes in Amazonia (endozoochory, synzoochory, anemochory and hydrochory). We examined if the proportional abundance of these dispersal modes could be explained by the availability of dispersal agents (disperser-availability hypothesis) and/or the availability of resources for constructing zoochorous fruits (resource-availability hypothesis). Time period: Tree-inventory plots established between 1934 and 2019. Major taxa studied: Trees with a diameter at breast height (DBH) ≥ 9.55 cm. Location: Amazonia, here defined as the lowland rain forests of the Amazon River basin and the Guiana Shield. Methods: We assigned dispersal modes to a total of 5433 species and morphospecies within 1877 tree-inventory plots across terra-firme, seasonally flooded, and permanently flooded forests. We investigated geographic patterns in the proportional abundance of dispersal modes. We performed an abundance-weighted mean pairwise distance (MPD) test and fit generalized linear models (GLMs) to explain the geographic distribution of dispersal modes. Results: Anemochory was significantly, positively associated with mean annual wind speed, and hydrochory was significantly higher in flooded forests. Dispersal modes did not consistently show significant associations with the availability of resources for constructing zoochorous fruits. A lower dissimilarity in dispersal modes, resulting from a higher dominance of endozoochory, occurred in terra-firme forests (excluding podzols) compared to flooded forests. Main conclusions: The disperser-availability hypothesis was well supported for abiotic dispersal modes (anemochory and hydrochory). The availability of resources for constructing zoochorous fruits seems an unlikely explanation for the distribution of dispersal modes in Amazonia. The association between frugivores and the proportional abundance of zoochory requires further research, as tree recruitment not only depends on dispersal vectors but also on conditions that favour or limit seedling recruitment across forest types.

    Pervasive gaps in Amazonian ecological research

    Biodiversity loss is one of the main challenges of our time [1,2], and attempts to address it require a clear understanding of how ecological communities respond to environmental change across time and space [3,4]. While the increasing availability of global databases on ecological communities has advanced our knowledge of biodiversity sensitivity to environmental changes [5–7], vast areas of the tropics remain understudied [8–11]. In the American tropics, Amazonia stands out as the world's most diverse rainforest and the primary source of Neotropical biodiversity [12], but it remains among the least known forests in America and is often underrepresented in biodiversity databases [13–15]. To worsen this situation, human-induced modifications [16,17] may eliminate pieces of the Amazon's biodiversity puzzle before we can use them to understand how ecological communities are responding. To increase generalization and applicability of biodiversity knowledge [18,19], it is thus crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple organism groups in a machine learning model framework to map the research probability across the Brazilian Amazonia, while identifying the region's vulnerability to environmental change. 15%–18% of the most neglected areas in ecological research are expected to experience severe climate or land use changes by 2050. This means that unless we take immediate action, we will not be able to establish their current status, much less monitor how it is changing and what is being lost.

    Mapping density, diversity and species-richness of the Amazon tree flora

    Using 2,046 botanically inventoried tree plots across the largest tropical forest on Earth, we mapped tree species-diversity and tree species-richness at 0.1-degree resolution, and investigated drivers for diversity and richness. Using only location, stratified by forest type, as predictor, our spatial model, to the best of our knowledge, provides the most accurate map of tree diversity in Amazonia to date, explaining approximately 70% of the tree diversity and species-richness. Large soil-forest combinations determine a significant percentage of the variation in tree species-richness and tree alpha-diversity in Amazonian forest-plots. We suggest that the size and fragmentation of these systems drive their large-scale diversity patterns and hence local diversity. A model not using location but cumulative water deficit, tree density, and temperature seasonality explains 47% of the tree species-richness in the terra-firme forest in Amazonia. Over large areas across Amazonia, residuals of this relationship are small and poorly spatially structured, suggesting that much of the residual variation may be local. The Guyana Shield area has consistently negative residuals, showing that this area has lower tree species-richness than expected by our models. We provide extensive plot meta-data, including tree density, tree alpha-diversity and tree species-richness results and gridded maps at 0.1-degree resolution.